Lecture 6
A tibble is a new way to store data in R in a tabular format. There are some slight (but important) differences between tibbles and data frames:
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
# A tibble: 150 × 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>
 1          5.1         3.5          1.4         0.2 setosa
 2          4.9         3            1.4         0.2 setosa
 3          4.7         3.2          1.3         0.2 setosa
 4          4.6         3.1          1.5         0.2 setosa
 5          5           3.6          1.4         0.2 setosa
 6          5.4         3.9          1.7         0.4 setosa
 7          4.6         3.4          1.4         0.3 setosa
 8          5           3.4          1.5         0.2 setosa
 9          4.4         2.9          1.4         0.2 setosa
10          4.9         3.1          1.5         0.1 setosa
# ℹ 140 more rows
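The two printouts above can be reproduced with a short sketch. It assumes the tibble package is installed (the text does not say which package was used) and relies on the built-in iris data; the names iris_tbl and iris_df are chosen here purely for illustration.

```r
library(tibble)

# A base R data frame: head() limits the printout to the first rows.
head(iris, 3)

# The same data as a tibble: printing shows only the first 10 rows,
# together with the dimensions and the type of each column.
iris_tbl <- as_tibble(iris)
iris_tbl

# Convert back to a plain data frame when a function requires one.
iris_df <- as.data.frame(iris_tbl)
class(iris_df)  # "data.frame"
```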
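If you use readr to import external data, remember that the result will always be a tibble.
A tibble can be converted back to a regular data frame with as.data.frame().
The ggplot2 package is loaded with library(ggplot2).
We will work with the mpg dataset. Let’s load it into memory and read the associated help page.
Now, let’s create our first plot: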
What relationship exists between hwy (highway miles per gallon) and displ (engine displacement, in litres)?
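A minimal sketch of a scatterplot that addresses this question, assuming ggplot2 is installed. The object name first_plot is illustrative and not taken from the lecture.

```r
library(ggplot2)

# Inspect the data; in an interactive session, ?mpg opens its help page.
head(mpg)

# Scatterplot of highway mileage (hwy) against engine displacement (displ).
first_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))
first_plot
```

In the mpg data the points trend downward: cars with larger engines tend to travel fewer highway miles per gallon.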
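The first function, ggplot, creates an empty plot.
The geom_point function adds a layer to the empty plot representing the scatterplot of hwy vs displ.
The mapping argument is set through the aes function, within which we specify which values to map to the x-axis and the y-axis.
In general, a plot in ggplot2 is created using a command of the following form, where DATA is the dataset: ggplot(data = DATA) + GEOM_FUNCTION(mapping = aes(MAPPINGS))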
GEOM_FUNCTION is a function that creates a layer, and MAPPINGS are the parameters we pass to the function. As the analysis becomes more complex, we will continue to add more layers.
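As a sketch of the general pattern described above, the plot can be built in two explicit steps. DATA is an assumed placeholder name for the dataset, and empty_plot and scatter are illustrative object names.

```r
library(ggplot2)

# General pattern (DATA, GEOM_FUNCTION and MAPPINGS are placeholders,
# not real object names):
#   ggplot(data = DATA) +
#     GEOM_FUNCTION(mapping = aes(MAPPINGS))

# Step 1: ggplot() alone creates an empty plot with no layers.
empty_plot <- ggplot(data = mpg)
empty_plot

# Step 2: a GEOM_FUNCTION, here geom_point(), adds one layer
# together with its aesthetic mappings.
scatter <- empty_plot +
  geom_point(mapping = aes(x = displ, y = hwy))
scatter
```

Storing the intermediate object is optional; it only makes visible that each `+` adds a layer to an existing plot.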
For each GEOM_FUNCTION (like geom_point), there are various types of aesthetics that can be modified to customize many aspects of each layer. A complete list is available by reading the help associated with each function.
Let’s read the help page of the geom_point function. We see that the possible aesthetics are not just x and y (the coordinates of the point), but also colour, size, and shape.
Now, let’s create our second plot:
How does the relationship between hwy and displ change with different vehicle types?
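One way to explore this question is to map the class variable to the colour aesthetic. The following is a minimal sketch, with class_plot as an illustrative name.

```r
library(ggplot2)

# Map the class variable to the colour aesthetic: each vehicle type
# gets its own colour, and ggplot2 adds a legend automatically.
class_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, colour = class))
class_plot
```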
We can also associate the class variable with different characteristics of a point, such as its size:
or its shape:
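The size and shape mappings mentioned above can be sketched as follows. The object names are illustrative, and both plots are expected to produce warnings, as noted in the comments.

```r
library(ggplot2)

# Map class to point size. ggplot2 warns that size is not advised
# for a discrete variable, since the categories have no natural order.
size_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, size = class))
size_plot

# Map class to point shape. The default shape palette covers only
# 6 values, so one of the 7 classes is left unplotted, with a warning.
shape_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class))
shape_plot
```

For an unordered categorical variable such as class, colour is usually the easier aesthetic to read.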
There are many other GEOM_FUNCTIONS besides geom_point that are used to create different types of plots (e.g., histograms, bar charts, error-bar charts…).
All GEOM_FUNCTIONS require one or more aesthetics, but not all can work with the same aesthetics.
For example, we can use geom_smooth to create a sort of trend line using a different type of geometry than geom_point:
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
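A minimal sketch of such a trend line, using the same x and y mappings as the scatterplot. The name smooth_plot is illustrative. Drawing the plot prints the `geom_smooth()` message quoted above.

```r
library(ggplot2)

# Same x and y mappings as the scatterplot, but drawn as a smoothed
# line with a confidence band instead of individual points.
smooth_plot <- ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy))
smooth_plot
```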
Let’s try to modify the linetype:
`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
This gives us a separate trend line, each with its own line type, for every value of drv (the type of drive train).
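The linetype mapping just described can be sketched as follows, assuming drv is the variable mapped to linetype (as the text indicates). The name linetype_plot is illustrative.

```r
library(ggplot2)

# Mapping drv to linetype splits the data into one group per drive
# train (4, f, r) and fits a separate smoothed line to each group.
linetype_plot <- ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv))
linetype_plot
```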
Let’s also modify the color of the curves:
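A sketch of one way to do this, under the assumption that the colour should follow the same drv variable as the line type. The name colour_plot is illustrative.

```r
library(ggplot2)

# Map drv to both linetype and colour, so each drive train gets a
# distinct line style and a distinct colour in a single legend.
colour_plot <- ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv, colour = drv))
colour_plot
```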
One of the strengths of ggplot2 is the ability to easily represent two or more geometries on the same plot:

    ggplot(data = mpg) +
      geom_smooth(mapping = aes(x = displ, y = hwy)) +
      geom_point(mapping = aes(x = displ, y = hwy))

`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
To avoid repeating the same mappings, we can specify the shared aesthetics inside the ggplot function and the unique aesthetics inside the GEOM_FUNCTION:

    ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
      geom_point(mapping = aes(col = class)) +
      geom_smooth()

`geom_smooth()` using method = 'loess' and formula = 'y ~ x'
    ggplot(data = mpg, mapping = aes(x = displ, y = hwy, col = class)) +
      geom_point(size = 3, alpha = 0.7) + # Larger points with transparency
      geom_smooth(se = FALSE, linetype = "dashed", linewidth = 1.2, span = 1.5) + # Smoother lines without confidence interval
      scale_color_brewer(palette = "Set1") + # Qualitative ColorBrewer palette ("Dark2" is a colorblind-friendly alternative)
      labs(
        title = "Fuel Efficiency vs Engine Displacement",
        subtitle = "Relationship between engine size and highway fuel efficiency across car types",
        x = "Engine Displacement (liters)",
        y = "Highway Fuel Efficiency (mpg)",
        color = "Vehicle Class"
      ) +
      theme_minimal(base_size = 15) + # Clean minimalistic theme
      theme(
        plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
        plot.subtitle = element_text(size = 14, hjust = 0.5),
        legend.position = "bottom", # Move legend to the bottom
        legend.title = element_text(size = 12),
        legend.text = element_text(size = 10),
        legend.background = element_rect(fill = "gray95", color = NA)
      )

An RStudio add-in for building ggplot2 plots interactively is esquisse, available at the esquisse GitHub repository.
It can be installed with install.packages("esquisse"). Launch it by clicking on the ‘ggplot2’ builder in the drop-down menu called Addins, available under the menu bar.